

Towards Understanding Learning Representations: To What Extent Do Different Neural Networks Learn the Same Representation

Liwei Wang, Lunjia Hu, Jiayuan Gu, Zhiqiang Hu, Yue Wu, Kun He, John Hopcroft

Neural Information Processing Systems

In this work, we move a tiny step towards a theory and better understanding of the representations. Specifically, we study a simpler problem: how similar are the representations learned by two networks with identical architecture but trained from different initializations? We develop a rigorous theory based on the neuron activation subspace match model.
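
The subspace match model compares the spans of neuron activations rather than individual neurons, so two networks can "match" even when no single neuron in one corresponds to a single neuron in the other. A minimal sketch of this idea, using principal angles between activation subspaces as a simple proxy rather than the paper's exact matching algorithm (all names and sizes here are illustrative):

```python
import numpy as np
from scipy.linalg import subspace_angles

rng = np.random.default_rng(1)

# Activations of 5 neurons over 100 inputs, one column per neuron.
acts_a = rng.normal(size=(100, 5))       # network A's neuron activations
mix = rng.normal(size=(5, 5))            # an (almost surely invertible) re-mixing
acts_b = acts_a @ mix                    # network B: different neurons, same span

# Principal angles between the two activation subspaces: all (near) zero
# angles indicate the subspaces coincide, i.e. a full subspace match.
angles = subspace_angles(acts_a, acts_b)
```

Here the two activation matrices span the same 5-dimensional subspace by construction, so every principal angle is (numerically) zero even though no individual column of `acts_b` equals a column of `acts_a`.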


Peripheral Vision Transformer

Neural Information Processing Systems

Human vision possesses a special type of visual processing system called peripheral vision. By partitioning the entire visual field into multiple contour regions based on the distance from the center of our gaze, peripheral vision enables us to perceive different visual features in different regions. In this work, we take a biologically inspired approach and explore modeling peripheral vision in deep neural networks for visual recognition. We propose incorporating peripheral position encoding into the multi-head self-attention layers to let the network learn, from the training data, to partition the visual field into diverse peripheral regions. We evaluate the proposed network, dubbed PerViT, on ImageNet-1K and systematically investigate the inner workings of the model for machine perception, showing that the network learns to perceive visual data similarly to the way human vision does. The performance improvements in image classification over the baselines across different model sizes demonstrate the efficacy of the proposed method.
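
The core mechanism is an additive, position-dependent bias on the attention scores, so that tokens at different distances can be weighted differently. A minimal sketch of this idea (this is an illustrative distance-based bias, not PerViT's exact learned parameterization; `slope` stands in for what the real model learns):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Tokens on a 4x4 spatial grid; each pair of tokens gets an additive
# attention bias depending on their Euclidean distance, so near
# ("foveal") and far ("peripheral") regions are treated differently.
grid = 4
coords = np.array([(i, j) for i in range(grid) for j in range(grid)], float)
dists = np.linalg.norm(coords[:, None] - coords[None, :], axis=-1)
slope = -0.5            # learnable in the real model; fixed for this sketch
bias = slope * dists    # peripheral position bias

d = 8
rng = np.random.default_rng(0)
q = rng.normal(size=(grid * grid, d))   # queries
k = rng.normal(size=(grid * grid, d))   # keys
attn = softmax(q @ k.T / np.sqrt(d) + bias)
```

With a negative slope the bias favors spatially nearby tokens; the network in the paper instead learns these position-dependent weightings from data.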


Emergent Riemannian geometry over learning discrete computations on continuous manifolds

Julian Brandon, Angus Chadwick, Arthur Pellegrino

arXiv.org Machine Learning

Many tasks require mapping continuous input data (e.g. images) to discrete task outputs (e.g. class labels). Yet, how neural networks learn to perform such discrete computations on continuous data manifolds remains poorly understood. Here, we show that signatures of such computations emerge in the representational geometry of neural networks as they learn. By analysing the Riemannian pullback metric across layers of a neural network, we find that network computation can be decomposed into two functions: discretising continuous input features and performing logical operations on these discretised variables. Furthermore, we demonstrate how different learning regimes (rich vs. lazy) have contrasting metric and curvature structures, affecting the ability of the networks to generalise to unseen inputs. Overall, our work provides a geometric framework for understanding how neural networks learn to perform discrete computations on continuous manifolds.
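
The pullback metric referred to here measures how a network layer stretches or compresses directions on the input manifold: for a layer f it is G(x) = J(x)ᵀJ(x), where J is the Jacobian of f at x. A minimal sketch for a single tanh layer (the layer and sizes are illustrative, not the paper's networks):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 2))   # one layer mapping R^2 -> R^3

def layer(x):
    return np.tanh(W @ x)

def jacobian(x):
    # d tanh(Wx)/dx = diag(1 - tanh(Wx)^2) @ W
    pre = W @ x
    return (1.0 - np.tanh(pre) ** 2)[:, None] * W

def pullback_metric(x):
    # G(x) = J(x)^T J(x): pulls the Euclidean metric of the
    # representation space back onto the input space, describing
    # how the layer locally stretches input directions.
    J = jacobian(x)
    return J.T @ J

x = np.array([0.5, -1.0])
G = pullback_metric(x)
```

Directions with large eigenvalues of G are expanded by the layer (useful for discretising features), while near-zero eigenvalues indicate directions the layer collapses.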


Unveiling the Training Dynamics of ReLU Networks through a Linear Lens

Ye, Longqing

arXiv.org Artificial Intelligence

Deep neural networks, particularly those employing Rectified Linear Units (ReLU), are often perceived as complex, high-dimensional, non-linear systems. This complexity poses a significant challenge to understanding their internal learning mechanisms. In this work, we propose a novel analytical framework that recasts a multi-layer ReLU network into an equivalent single-layer linear model with input-dependent "effective weights". For any given input sample, the activation pattern of ReLU units creates a unique computational path, effectively zeroing out a subset of weights in the network. By composing the active weights across all layers, we can derive an effective weight matrix, $W_{\text{eff}}(x)$, that maps the input directly to the output for that specific sample. We posit that the evolution of these effective weights reveals fundamental principles of representation learning. Our work demonstrates that as training progresses, the effective weights corresponding to samples from the same class converge, while those from different classes diverge. By tracking the trajectories of these sample-wise effective weights, we provide a new lens through which to interpret the formation of class-specific decision boundaries and the emergence of semantic representations within the network.
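
The construction of $W_{\text{eff}}(x)$ can be made concrete for a small bias-free two-layer ReLU network: the activation pattern zeroes out a subset of first-layer rows, and composing the surviving weights yields a single linear map for that sample. A minimal sketch (sizes are illustrative; biases are omitted so the equivalence is exact):

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(5, 3))   # first-layer weights (no biases, for clarity)
W2 = rng.normal(size=(2, 5))   # second-layer weights

def forward(x):
    # standard two-layer ReLU network
    return W2 @ np.maximum(W1 @ x, 0.0)

def effective_weights(x):
    # The ReLU activation pattern picks a computational path: rows of W1
    # with non-positive pre-activations are zeroed out, and composing the
    # active weights gives a single linear map W_eff(x) for this sample.
    mask = (W1 @ x > 0.0).astype(float)
    return W2 @ (mask[:, None] * W1)

x = rng.normal(size=3)
W_eff = effective_weights(x)   # satisfies W_eff @ x == forward(x)
```

Tracking how `W_eff` changes across training samples and steps is the "linear lens" the abstract describes: samples from the same class should yield converging effective weight matrices.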





Learning by Combining Memorization and Gradient Descent

Neural Information Processing Systems

We have created a radial basis function network that allocates a new computational unit whenever an unusual pattern is presented to the network. The network learns by allocating new units and adjusting the parameters of existing units. If the network performs poorly on a presented pattern, then a new unit is allocated which memorizes the response to the presented pattern. If the network performs well on a presented pattern, then the network parameters are updated using standard LMS gradient descent. For predicting the Mackey-Glass chaotic time series, our network learns much faster than networks using back-propagation and uses a comparable number of synapses.
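
The allocate-or-descend rule can be sketched in a few lines: a unit is allocated only when the error is large *and* the input is far from every existing center; otherwise the output weights take an LMS step. This is an illustrative 1-D sketch, with thresholds and widths chosen arbitrarily rather than taken from the paper:

```python
import numpy as np

class ResourceAllocatingNetwork:
    # Minimal 1-D sketch of the allocate-or-descend rule; `err_tol`,
    # `dist_tol`, `width`, and `lr` are illustrative hyperparameters.
    def __init__(self, width=0.5, err_tol=0.05, dist_tol=0.3, lr=0.05):
        self.centers, self.weights = [], []
        self.width, self.err_tol = width, err_tol
        self.dist_tol, self.lr = dist_tol, lr

    def _activations(self, x):
        return np.exp(-((x - np.array(self.centers)) ** 2) / self.width ** 2)

    def predict(self, x):
        if not self.centers:
            return 0.0
        return float(np.dot(self.weights, self._activations(x)))

    def observe(self, x, y):
        err = y - self.predict(x)
        dist = min((abs(x - c) for c in self.centers), default=np.inf)
        if abs(err) > self.err_tol and dist > self.dist_tol:
            # unusual pattern: allocate a new unit that memorizes it
            self.centers.append(x)
            self.weights.append(err)
        elif self.centers:
            # familiar pattern: standard LMS gradient step on the weights
            acts = self._activations(x)
            self.weights = list(np.array(self.weights) + self.lr * err * acts)
```

A freshly allocated unit reproduces the presented target exactly at its center, which is the "memorization" half of the scheme; subsequent nearby samples only nudge the weights.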


Pixel Perfect: ESRGAN-powered High-Resolution Image Upscaling Platform

#artificialintelligence

The MLOps-based application is designed to upscale images by a factor of 4 using ESRGAN, a deep learning-based technique for image super-resolution. The application is hosted on Render's web services and is implemented as a Flask-based API. Users can upload their low-resolution images to the API, and the application will use ESRGAN to upscale them to four times their original resolution. The API is optimised for scalability and can simultaneously handle large volumes of image requests. With this application, users can easily enhance the quality of their images and produce high-resolution versions for their various use cases.
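
The described workflow (POST a low-resolution image, receive a 4x upscaled result) can be sketched as a tiny Flask API. Everything here is illustrative: the `/upscale` route name is hypothetical, and a nearest-neighbour 4x resize stands in for the ESRGAN model, whose weights and inference code are not part of this blurb:

```python
import numpy as np
from flask import Flask, request

app = Flask(__name__)

def upscale_4x(pixels):
    # Stand-in for ESRGAN inference: nearest-neighbour 4x upscaling.
    # Each pixel is repeated 4 times along both spatial axes.
    return pixels.repeat(4, axis=0).repeat(4, axis=1)

@app.route("/upscale", methods=["POST"])
def upscale():
    # Hypothetical endpoint: accepts a square grayscale image as raw
    # bytes and returns the 4x-upscaled image as raw bytes.
    raw = np.frombuffer(request.data, dtype=np.uint8)
    side = int(len(raw) ** 0.5)
    img = raw[: side * side].reshape(side, side)
    return upscale_4x(img).tobytes()
```

A real deployment would decode standard image formats, run the ESRGAN model, and queue requests for scalability, but the request/response shape is the same.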


Machine Learning: Learn By Building Web Apps in Python

#artificialintelligence

Machine Learning: Learn By Building Web Apps in Python - learn basic to advanced machine learning algorithms by creating web applications with Flask. Machine learning is a branch of artificial intelligence (AI) focused on building applications that learn from data and improve their accuracy over time without being explicitly programmed to do so. In data science, an algorithm is a sequence of statistical processing steps. In machine learning, algorithms are 'trained' to find patterns and features in massive amounts of data in order to make decisions and predictions based on new data. The better the algorithm, the more accurate its decisions and predictions become as it processes more data. Machine learning has led to some remarkable results, such as analyzing medical images and predicting diseases on par with human experts.


A Brief Introduction to Facial Recognition (Part 1)

#artificialintelligence

One of the most interesting avenues unlocked by artificial intelligence and analytics is facial recognition, powered by AI and ML algorithms. We see this application in daily use in smartphones, security stations, and elsewhere. In this introductory blog, I will provide a quick walkthrough of the technology involved in facial recognition. Facial recognition is a technique used to detect and identify the faces of individuals whose images are saved in a data set. Although other methods of identification may be more accurate, facial recognition has remained a significant research topic because of its non-invasive nature and ease of use. Various algorithms can perform facial recognition, but their accuracy varies.